Binaural Cues for Fragment-Based Speech Recognition in Reverberant Multisource Environments
نویسندگان
چکیده
This paper addresses the problem of speech recognition using distant binaural microphones in reverberant multisource noise conditions. Our scheme employs a two stage fragment decoding approach: first spectro-temporal acoustic source fragments are identified using signal level cues, and second, a hypothesisdriven stage simultaneously searches for the most probable speech/background fragment labelling and the corresponding acoustic model state sequence. The paper reports the first successful attempt to use binaural localisation cues within this framework. By integrating binaural cues and acoustic models in a consistent probabilistic framework, the decoder is able to derive significant recognition performance benefits from fragment location estimates despite their inherent unreliability.
منابع مشابه
Recent advances in fragment-based speech recognition in reverberant multisource environments
This paper addresses the problem of speech recognition using distant binaural microphones in reverberant multisource noise conditions. Our scheme employs a two stage fragment decoding approach: first spectro-temporal acoustic source fragments are identified using signal level cues, and second, a hypothesisdriven stage simultaneously searches for the most probable speech/background fragment labe...
متن کاملBinaural deep neural network classification for reverberant speech segregation
While human listening is robust in complex auditory scenes, current speech segregation algorithms do not perform well in noisy and reverberant environments. This paper addresses the robustness in binaural speech segregation by employing binary classification based on deep neural networks (DNNs). We systematically examine DNN generalization to untrained configurations. Evaluations and comparison...
متن کاملMask estimation and sparse imputation for missing data speech recognition in multisource reverberant environments
This work presents an automatic speech recognition system which uses a missing data approach to compensate for environmental noise. The missing, noise-corrupted components are identified using binaural features or a support vector machine (SVM) classifier. To perform speech recognition using the partially observed data, the missing components are substituted with clean speech estimates calculat...
متن کاملA hearing-inspired approach for distant-microphone speech recognition in the presence of multiple sources
This paper addresses the problem of speech recognition in reverberant multisource noise conditions using distant binaural microphones. Our scheme employs a two-stage fragment decoding approach inspired by Bregman’s account of auditory scene analysis, in which innate primitive grouping ‘rules’ are balanced by the role of learnt schema-driven processes. First, the acoustic mixture is split into l...
متن کاملBinaural Reverberant Speech Separation Based on Deep Neural Networks
Supervised learning has exhibited great potential for speech separation in recent years. In this paper, we focus on separating target speech in reverberant conditions from binaural inputs using supervised learning. Specifically, deep neural network (DNN) is constructed to map from both spectral and spatial features to a training target. For spectral features extraction, we first convert binaura...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2011